Load necessary packages and define directories.
Load Gross Primary Productivity (GPP) MODIS (2000–2022) and ERA5 (1997–2020) climate metrics computed in Earth Engine and convert them to long format. These metrics have been spatially aggregated to county boundaries, as depicted in Figure 4 of (Kenduiywo, Ghosh, Hijmans, & Ndungu, 2020).
Format the date variable by removing unnecessary characters.
| County | date | gpp |
|---|---|---|
| Baringo | 2000-02-26 | 130.1147 |
| Bomet | 2000-02-26 | 364.8841 |
| County | date | temp |
|---|---|---|
| Baringo | 1997-01-01 | 297.4850 |
| Bomet | 1997-01-01 | 292.9671 |
| County | date | precip |
|---|---|---|
| Baringo | 1997-01-01 | 0.0001701 |
| Bomet | 1997-01-01 | 0.0008220 |
Temporally aggregate the spatial metrics per year to obtain spatio-temporal climate metrics. Conflict events from violence against civilians will also be aggregated per year and county.
gpp$date <- format(as.Date(gpp$date, format="%d/%m/%Y"),"%Y")
gpp <- aggregate(.~County+date, data=gpp, mean, na.rm=T)
tmp$date <- format(as.Date(tmp$date, format="%d/%m/%Y"),"%Y")
tmp <- aggregate(.~County+date, data=tmp, mean, na.rm=T)
prec$date <- format(as.Date(prec$date, format="%d/%m/%Y"),"%Y")
prec <- aggregate(.~County+date, data=prec, mean, na.rm=T)
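As a toy illustration of the pattern above (made-up county names and values, not the real data): extract the year from each date, then average every remaining column within each county-year group.

```r
d <- data.frame(County = c("A", "A", "B"),
                date   = c("15/01/2010", "20/06/2010", "15/01/2010"),
                temp   = c(10, 20, 30))
# keep only the year, then average within County x year
d$date <- format(as.Date(d$date, format = "%d/%m/%Y"), "%Y")
aggregate(. ~ County + date, data = d, mean, na.rm = TRUE)
##   County date temp
## 1      A 2010   15
## 2      B 2010   30
```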
con <- readRDS("civilianViolence.rds")
con <- subset(con@data, select= c(ADMIN1, FATALITIES, YEAR))
names(con)[1] <- "County"
names(con)[3] <- "date"
con <- aggregate(.~County+date, data=con, sum, na.rm=T)
Let’s make some plots and evaluate trends. First, merge each metric with the conflict data. Before merging, check that county names are consistent across datasets.
con$County <- toupper(con$County)
gpp$County <- toupper(gpp$County)
tmp$County <- toupper(tmp$County)
prec$County <- toupper(prec$County)
c <- sort(unique(con$County))
c[!c %in% sort(unique(gpp$County))]
## [1] "ELGEYO MARAKWET" "MURANGA"
c[!c %in% sort(unique(tmp$County))]
## [1] "ELGEYO MARAKWET" "MURANGA"
c[!c %in% sort(unique(prec$County))]
## [1] "ELGEYO MARAKWET" "MURANGA"
Fix the naming inconsistencies for the identified counties.
con$County[con$County == "MURANGA"] <- "MURANG'A"
con$County[con$County == "ELGEYO MARAKWET" ] <- "ELGEYO-MARAKWET"
We can also compute anomalies for these climate metrics. The Z-score is a good measure to capture anomalies: it indicates the deviation of a seasonal/annual metric from its long-term annual mean, in units of standard deviations. Thus a z-score of -1 means that the value is 1 standard deviation below the (expected) mean, i.e.:
zscore <- function(y){
(y - mean(y, na.rm=TRUE) ) / (sd(y, na.rm=TRUE))
}
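As a quick sanity check of the `zscore()` function defined above, take a toy vector whose mean is 2 and whose standard deviation is exactly 1, so the z-scores are simply the deviations from the mean:

```r
# zscore() as defined above
zscore <- function(y){
  (y - mean(y, na.rm=TRUE)) / (sd(y, na.rm=TRUE))
}
# mean = 2, sd = 1, so the z-scores equal the raw deviations
zscore(c(1, 2, 3))
## [1] -1  0  1
```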
Compute the z-scored metrics per year per County.
library(dplyr)  # for group_by(), mutate(), ungroup()
gpp <- ungroup(mutate(group_by(gpp,County), zgpp=zscore(gpp)))
tmp <- ungroup(mutate(group_by(tmp,County), ztemp=zscore(temp)))
prec <- ungroup(mutate(group_by(prec,County), zprecip=zscore(precip)))
con <- ungroup(mutate(group_by(con,County), zFATALITIES=zscore(FATALITIES)))
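The `group_by`/`mutate` pattern above computes each z-score relative to that county's own history, not the national record. A minimal sketch with fabricated values (assuming `dplyr` and the `zscore()` above are available):

```r
library(dplyr)

zscore <- function(y){
  (y - mean(y, na.rm=TRUE)) / (sd(y, na.rm=TRUE))
}

d <- data.frame(County = rep(c("A", "B"), each = 3),
                temp   = c(1, 2, 3, 10, 20, 30))
# z-scores use each county's own mean and sd, so both counties
# yield -1, 0, 1 despite operating on very different scales
ungroup(mutate(group_by(d, County), ztemp = zscore(temp)))
```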
Make trend plots.
x <- aggregate(FATALITIES~date, data=con, mean, na.rm=T)
y <- aggregate(temp~date, data=tmp, mean, na.rm=T)
z <- aggregate(precip~date, data=prec, mean, na.rm=T)
k <- aggregate(gpp~date, data=gpp, mean, na.rm=T)
par(mfrow = c(2, 2), mar = c(4, 5, 1, 1)) #c(bottom, left, top, right)
plot(FATALITIES~date, data=x, type="l", col="red", ylab="# Fatalities", xlab= "Year", cex.axis =1.2, cex.lab = 1.2)
plot(gpp~date, data=k, type="l", col="red", ylab="GPP", xlab= "Year", cex.axis =1.2, cex.lab = 1.2)
plot(precip~date, data=z, type="l", col="red", ylab="Precipitation", xlab= "Year", cex.axis =1.2, cex.lab = 1.2)
plot(temp~date, data=y, type="l", col="red", ylab="2m Air Temperature", xlab= "Year", cex.axis =1.2, cex.lab = 1.2)

As we proceed, let’s merge all the datasets into one data.frame.
df_list <- list(con, tmp, gpp, prec)
dff <- Reduce(function(x, y) merge(x, y, by=c("County","date")), df_list)
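The `Reduce()` call successively merges pairs of data frames on the shared `County` and `date` keys: the first two tables are merged, then that result is merged with the third, and so on. A toy example with fabricated one-row tables:

```r
a <- data.frame(County = "A", date = "2010", x = 1)
b <- data.frame(County = "A", date = "2010", y = 2)
d <- data.frame(County = "A", date = "2010", z = 3)
# merge(a, b) first, then merge that result with d
Reduce(function(x, y) merge(x, y, by = c("County", "date")), list(a, b, d))
##   County date x y z
## 1      A 2010 1 2 3
```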
Make regression plots to identify relationship between conflicts and various climate EO metrics.
library(ggplot2)
library(gridExtra)
##
## Attaching package: 'gridExtra'
## The following object is masked from 'package:dplyr':
##
## combine
library(reshape2)  # provides melt()
aa <- melt(subset(dff, select=-c(temp, gpp, precip, FATALITIES)), variable.name = "metric_name", value.name = "metric", id.vars = c("County", "date", "zFATALITIES"))
ggplot(aa, aes(metric, zFATALITIES)) +
geom_point() +
geom_smooth(method="loess") +
facet_wrap(~ metric_name)
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 6 rows containing non-finite values (stat_smooth).
## Warning: Removed 6 rows containing missing values (geom_point).

# Explore other plots using the raw (non-anomaly) metric values
p1 <- ggplot(dff, aes(gpp, FATALITIES)) +
geom_point() +
geom_smooth(method = "loess")
p2 <- ggplot(dff, aes(temp, FATALITIES)) +
geom_point() +
geom_smooth(method = "loess")
p3 <- ggplot(dff, aes(precip, FATALITIES)) +
geom_point() +
geom_smooth(method = "loess")
grid.arrange(p1, p2, p3, ncol = 2)
## `geom_smooth()` using formula 'y ~ x'
## `geom_smooth()` using formula 'y ~ x'
## `geom_smooth()` using formula 'y ~ x'

Precipitation appears to be a good indicator of conflict: for instance, fatalities are high when precipitation is low.
We can conduct spatial regression to evaluate vulnerable areas. For this Geographically weighted regression (GWR) is adopted. GWR is an exploratory technique mainly intended to indicate where non-stationarity is taking place on the map, that is where locally weighted regression coefficients move away from their global values. Its basis is the concern that the fitted coefficient values of a global model, fitted to all the data, may not represent detailed local variations in the data adequately - in this it follows other local regression implementations. It differs, however, in not looking for local variation in ‘data’ space, but by moving a weighted window over the data, estimating one set of coefficient values at every chosen ‘fit’ point. The fit points are very often the points at which observations were made, but do not have to be. If the local coefficients vary in space, it can be taken as an indication of non-stationarity.
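Formally, at each fit point $u_i$, GWR solves a weighted least-squares problem in which nearby observations receive larger weights (standard GWR notation, not stated explicitly above):

$$
\hat{\beta}(u_i) = \left( X^{\top} W(u_i)\, X \right)^{-1} X^{\top} W(u_i)\, y,
$$

where $W(u_i)$ is a diagonal matrix whose entries decrease with the distance between each observation's location and $u_i$. Mapping how $\hat{\beta}(u_i)$ varies across fit points is what reveals non-stationarity.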
library(raster)
##
## Attaching package: 'raster'
## The following object is masked from 'package:dplyr':
##
## select
ken <- getData("GADM", country="KEN", level=1)
## Warning in getData("GADM", country = "KEN", level = 1): getData will be removed in a future version of raster
## . You can use access these data, and more, with functions from the geodata package instead
ken$NAME_1 <- toupper(ken$NAME_1)
names(ken)[4] <- "County"
ken <- ken[,"County"]
Merge the County boundaries with the data.frame containing the climate metrics and conflict information.
#ken <- merge(ken, dff, by="County", duplicateGeoms = T)
df <- aggregate(.~County, data = dff[,-2], mean, na.rm=T)
ken <- merge(ken, df, by="County")
sp.na.omit <- function(x, margin=1) {
  if (!inherits(x, "SpatialPointsDataFrame") & !inherits(x, "SpatialPolygonsDataFrame"))
    stop("MUST BE sp SpatialPointsDataFrame OR SpatialPolygonsDataFrame CLASS OBJECT")
  # indices of rows (margin = 1) or columns (margin = 2) containing NA values
  na.index <- unique(as.data.frame(which(is.na(x@data), arr.ind=TRUE))[, margin])
  if (margin == 1) {
    cat("DELETING ROWS: ", na.index, "\n")
    return(x[-na.index, ])
  }
  if (margin == 2) {
    cat("DELETING COLUMNS: ", na.index, "\n")
    return(x[, -na.index])
  }
}
#shapefile(ken, "ken_conflicts.shp", overwrite=TRUE)
ken <- sp.na.omit(ken)
## DELETING ROWS: 41
Prior to running the GWR model we need to calculate a kernel bandwidth. This determines how GWR subsets the data when it fits multiple local models across space.
if (!"spgwr" %in% rownames(installed.packages())) {
  install.packages("spgwr", dependencies = TRUE)
}
library(spgwr)
GWRbandwidth <- gwr.sel(FATALITIES ~ precip, data=ken, adapt=T)
## Adaptive q: 0.381966 CV score: 1922.11
## Adaptive q: 0.618034 CV score: 1999.398
## Adaptive q: 0.236068 CV score: 1872.682
## Adaptive q: 0.145898 CV score: 1857.621
## Adaptive q: 0.07619138 CV score: 1855.048
## Adaptive q: 0.08837297 CV score: 1861.617
## Adaptive q: 0.04708886 CV score: 1848.961
## Adaptive q: 0.02910252 CV score: 1804.984
## Adaptive q: 0.01798634 CV score: 1821.511
## Adaptive q: 0.02904693 CV score: 1804.821
## Adaptive q: 0.02541111 CV score: 1795.552
## Adaptive q: 0.0225751 CV score: 1790.406
## Adaptive q: 0.02082235 CV score: 1790.541
## Adaptive q: 0.02179189 CV score: 1789.297
## Adaptive q: 0.02172365 CV score: 1789.227
## Adaptive q: 0.02137939 CV score: 1789.417
## Adaptive q: 0.02162399 CV score: 1789.246
## Adaptive q: 0.02168296 CV score: 1789.232
## Adaptive q: 0.02172365 CV score: 1789.227
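In the usual formulation of spgwr's default Gaussian kernel (`gwr.Gauss`), the weight given to observation $j$ when fitting at point $i$ decays with the distance $d_{ij}$ between them:

$$
w_{ij} = \exp\!\left( -\tfrac{1}{2} \left( \frac{d_{ij}}{b} \right)^{2} \right).
$$

With `adapt=T`, the bandwidth $b$ varies per fit point so that a fixed proportion $q$ of the observations carries most of the weight; the "Adaptive q" lines above show `gwr.sel` searching over that proportion to minimize the cross-validation (CV) score.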
Run GWR model and view the results.
gwr.model = gwr(FATALITIES ~ precip, data=ken, adapt=GWRbandwidth, hatmatrix=TRUE, se.fit=TRUE)
## Warning in proj4string(data): CRS object has comment, which is lost in output; in tests, see
## https://cran.r-project.org/web/packages/sp/vignettes/CRS_warnings.html
gwr.model
## Call:
## gwr(formula = FATALITIES ~ precip, data = ken, adapt = GWRbandwidth,
## hatmatrix = TRUE, se.fit = TRUE)
## Kernel function: gwr.Gauss
## Adaptive quantile: 0.02172365 (about 0 of 46 data points)
## Summary of GWR coefficient estimates at data points:
## Min. 1st Qu. Median 3rd Qu. Max.
## X.Intercept. -4.1388 3.4140 8.6906 12.1235 46.1066
## precip -14490.4431 -1573.2499 -994.5646 -50.0824 3721.9219
## Global
## X.Intercept. 10.775
## precip -1076.530
## Number of data points: 46
## Effective number of parameters (residual: 2traceS - traceS'S): 27.21259
## Effective degrees of freedom (residual: 2traceS - traceS'S): 18.78741
## Sigma (residual: 2traceS - traceS'S): 5.666171
## Effective number of parameters (model: traceS): 20.68544
## Effective degrees of freedom (model: traceS): 25.31456
## Sigma (model: traceS): 4.881327
## Sigma (ML): 3.62113
## AICc (GWR p. 61, eq 2.33; p. 96, eq. 4.21): 334.4981
## AIC (GWR p. 96, eq. 4.22): 269.6121
## Residual sum of squares: 603.1789
## Quasi-global R2: 0.7337493
This model has a quasi-global R-squared value of 0.73, so 73% of the variance can be explained by the model. This indicates high in-sample prediction accuracy.
We will then bind the outputs to our ken polygon so we can map them.
results <- as.data.frame(gwr.model$SDF)
names(results)
## [1] "sum.w" "X.Intercept." "precip"
## [4] "X.Intercept._se" "precip_se" "gwr.e"
## [7] "pred" "pred.se" "localR2"
## [10] "X.Intercept._se_EDF" "precip_se_EDF" "pred.se.1"
gwr.map <- cbind(ken, as.matrix(results))
Map the local R2.
#qtm(gwr.map, fill = "localR2")
library(tmap)
tm_shape(gwr.map) + tm_fill("localR2", n = 5, style = "quantile") + tm_layout(frame = FALSE, legend.text.size = 0.5, legend.title.size = 0.6)
## Variable(s) "localR2" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.
Display fatalities and precipitation.
map1 <- tm_shape(gwr.map) + tm_fill("FATALITIES") +
  tm_layout(frame = FALSE, legend.text.size = 0.8, legend.title.size = 0.8) +
  tm_text("County", size = 0.6, remove.overlap = TRUE) +
  tm_format("World")
map2 <- tm_shape(gwr.map) + tm_fill("precip") +
  tm_layout(frame = FALSE, legend.text.size = 0.8, legend.title.size = 0.8) +
  tm_text("County", size = 0.6, remove.overlap = TRUE) +
  tm_format("World")
map1
map2